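This paper reports insights from robotics researchers who participated in a 5-day robot-assisted nuclear disaster response field exercise conducted by Kerntechnische Hilfdienst GmbH (KHG) in Karlsruhe, Germany. The German nuclear industry established KHG to provide a robot-assisted emergency response capability for nuclear accidents. We give a systematic description of the equipment used, the robot operators' training program, the field exercise and robot tasks, and the protocols followed during the exercise. We also provide insights and suggestions for advancing disaster response robotics based on these observations. Specifically, the main degradation in performance comes from the cognitive and attentional demands placed on the operator. Furthermore, robotic platforms and modules should aim to be robust and reliable in addition to being easy to use. Finally, because emergency response stakeholders are often skeptical about using autonomous systems, we suggest adopting a variable autonomy paradigm to gradually integrate autonomous robotic capabilities with the human-in-the-loop. This middle ground between teleoperation and autonomy can increase end-user acceptance while directly alleviating some of the operator's robot control burden and maintaining the resilience of the human-in-the-loop.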
Translated by Google Translate
In large-scale machine learning, recent works have studied the effects of compressing gradients in stochastic optimization in order to alleviate the communication bottleneck. These works have collectively revealed that stochastic gradient descent (SGD) is robust to structured perturbations such as quantization, sparsification, and delays. Perhaps surprisingly, despite the surge of interest in large-scale, multi-agent reinforcement learning, almost nothing is known about the analogous question: Are common reinforcement learning (RL) algorithms also robust to similar perturbations? In this paper, we investigate this question by studying a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction, where a general compression operator is used to model the perturbation. Our main technical contribution is to show that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic theoretical guarantees as their SGD counterparts. We then extend our results significantly to nonlinear stochastic approximation algorithms and multi-agent settings. In particular, we prove that for multi-agent TD learning, one can achieve linear convergence speedups in the number of agents while communicating just $\tilde{O}(1)$ bits per agent at each time step. Our work is the first to provide finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling. Our analysis hinges on studying the drift of a novel Lyapunov function that captures the dynamics of a memory variable introduced by error feedback.
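The interplay between compression and error feedback described above can be illustrated with a minimal sketch. It assumes top-k sparsification as the compression operator, a hand-made two-state deterministic Markov chain, and a common form of error feedback (add the stored residual before compressing, keep what was dropped). The names `top_k`, `error_feedback_step`, and `compressed_td0`, as well as the step size and chain, are illustrative choices and are not taken from the paper.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def error_feedback_step(direction, memory, k):
    """Compress (direction + memory); return what is sent and the new memory."""
    corrected = [d + m for d, m in zip(direction, memory)]
    sent = top_k(corrected, k)
    new_memory = [c - s for c, s in zip(corrected, sent)]  # what was dropped
    return sent, new_memory


def compressed_td0(features, transitions, rewards, gamma=0.9, alpha=0.01,
                   k=1, steps=50000, seed=0):
    """TD(0) with linear function approximation and a compressed update."""
    rng = random.Random(seed)
    states = list(range(len(features)))
    theta = [0.0] * len(features[0])
    memory = [0.0] * len(features[0])
    state = 0
    for _ in range(steps):
        # Markovian sampling: the next state depends on the current one.
        next_state = rng.choices(states, weights=transitions[state])[0]
        td_error = (rewards[state]
                    + gamma * dot(theta, features[next_state])
                    - dot(theta, features[state]))
        direction = [td_error * f for f in features[state]]
        sent, memory = error_feedback_step(direction, memory, k)
        theta = [t + alpha * s for t, s in zip(theta, sent)]
        state = next_state
    return theta


# Illustrative two-state cycle (an assumption, not the paper's benchmark).
FEATURES = [[1.0, 0.0], [1.0, 1.0]]
TRANSITIONS = [[0.0, 1.0], [1.0, 0.0]]
REWARDS = [1.0, 0.0]

theta_plain = compressed_td0(FEATURES, TRANSITIONS, REWARDS, k=2)
theta_compressed = compressed_td0(FEATURES, TRANSITIONS, REWARDS, k=1)
```

With `k` equal to the feature dimension nothing is dropped, the memory stays at zero, and the loop reduces to plain TD(0). With `k=1` only one coordinate is transmitted per step, and the memory variable carries the remainder forward to later steps.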
We investigate data-driven texture modeling via analysis and synthesis with generative adversarial networks. For network training and testing, we have compiled a diverse set of spatially homogeneous textures, ranging from stochastic to regular. We adopt StyleGAN3 for synthesis and demonstrate that it produces diverse textures beyond those represented in the training data. For texture analysis, we propose GAN inversion using a novel latent domain reconstruction consistency criterion for synthesized textures, and iterative refinement with Gramian loss for real textures. We propose perceptual procedures for evaluating network capabilities, exploring the global and local behavior of latent space trajectories, and comparing with existing texture analysis-synthesis techniques.
Functionality and dialogue experience are two important factors of task-oriented dialogue systems. Conventional approaches with closed schema (e.g., conversational semantic parsing) often fail as both the functionality and dialogue experience are strongly constrained by the underlying schema. We introduce a new paradigm for task-oriented dialogue, Dialog2API, to greatly expand the functionality and provide a seamless dialogue experience. The conversational model interacts with the environment by generating and executing programs that trigger a set of pre-defined APIs. The model also manages the dialogue policy and interacts with the user by generating appropriate natural language responses. By allowing the generation of free-form programs, Dialog2API supports composite goals that combine different APIs, whereas unrestricted program revision provides a natural and robust dialogue experience. To facilitate Dialog2API, the core model is provided with API documents, an execution environment and, optionally, some example dialogues annotated with programs. We propose an approach tailored for Dialog2API, where the dialogue states are represented by a stack of programs, with the most recently mentioned program on top of the stack. Dialog2API can work with many application scenarios such as software automation and customer service. In this paper, we construct a dataset for AWS S3 APIs and present evaluation results of in-context learning baselines.
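The stack-of-programs dialogue state mentioned above can be sketched as a small data structure. This is only an illustration of the idea under stated assumptions: the class name `ProgramStack`, its methods, and the program strings (`list_files`, `copy_file`) are hypothetical and do not correspond to the paper's implementation or to any real API.

```python
class ProgramStack:
    """Dialogue state as a stack of programs; the last list item is the top."""

    def __init__(self):
        self._programs = []  # list of (name, source) pairs

    def mention(self, name, source=None):
        """Push a new program, or move a re-mentioned one back to the top."""
        for index, (existing, old_source) in enumerate(self._programs):
            if existing == name:
                del self._programs[index]
                if source is None:
                    source = old_source
                break
        if source is None:
            raise KeyError(name)  # unknown program and nothing to push
        self._programs.append((name, source))

    def revise(self, new_source):
        """Replace the source of the most recently mentioned program."""
        name, _ = self._programs[-1]
        self._programs[-1] = (name, new_source)

    def top(self):
        return self._programs[-1]

    def names(self):
        return [name for name, _ in self._programs]


# Hypothetical dialogue: two goals, then the user returns to the first one.
state = ProgramStack()
state.mention("list", "list_files('reports')")
state.mention("copy", "copy_file('a.txt', 'backup/')")
state.mention("list")  # user refers back to the listing request
state.revise("list_files('reports', newest_first=True)")
```

Keeping the most recently mentioned program on top means a follow-up such as "sort those by date" can be resolved against `state.top()` without searching the whole history.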
This paper proposes an algorithm for motion planning among dynamic agents using adaptive conformal prediction. We consider a deterministic control system and use trajectory predictors to predict the dynamic agents' future motion, which is assumed to follow an unknown distribution. We then leverage ideas from adaptive conformal prediction to dynamically quantify prediction uncertainty from an online data stream. In particular, we provide an online algorithm that uses delayed agent observations to obtain uncertainty sets for multistep-ahead predictions with probabilistic coverage. These uncertainty sets are used within a model predictive controller to safely navigate among dynamic agents. While most existing data-driven prediction approaches quantify prediction uncertainty heuristically, we quantify the true prediction uncertainty in a distribution-free, adaptive manner that can even capture changes in prediction quality and in the agents' motion. We empirically evaluate our algorithm in a simulation case study in which a drone avoids a flying frisbee.
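To make the adaptive step concrete, the following sketch applies the standard adaptive conformal inference update, in which the working miscoverage level is nudged up after a covered step and down after a miss. It is a one-step-ahead simplification: the delayed observations and multistep predictions described above are not modeled, and the function names, step size, and synthetic error stream are assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, level):
    """Return the ceil((n + 1) * level)-th smallest score, inf if out of range."""
    if level <= 0.0:
        return 0.0
    rank = math.ceil((len(scores) + 1) * level)
    if rank > len(scores):
        return math.inf
    return sorted(scores)[rank - 1]


def adaptive_radii(errors, target_miss=0.1, step=0.05, warmup=10):
    """Turn a stream of prediction errors into uncertainty radii.

    Returns one (radius, missed) pair per error after the warm-up period,
    where `missed` says whether the error fell outside the radius.
    """
    miss_level = target_miss
    history = list(errors[:warmup])
    results = []
    for err in errors[warmup:]:
        radius = conformal_quantile(history, 1.0 - miss_level)
        missed = err > radius
        results.append((radius, missed))
        # Adaptive update: shrink the level after a miss, grow it otherwise.
        miss_level += step * (target_miss - (1.0 if missed else 0.0))
        history.append(err)
    return results


# Synthetic stream whose error scale triples halfway through (an assumption).
rng = random.Random(0)
stream = ([rng.uniform(0.0, 1.0) for _ in range(500)]
          + [rng.uniform(0.0, 3.0) for _ in range(500)])
results = adaptive_radii(stream)
miss_rate = sum(1 for _, missed in results if missed) / len(results)
```

In a planner, each `radius` would define a ball around the predicted agent position that the controller must avoid. The update keeps the long-run `miss_rate` close to `target_miss` even though the error distribution changes partway through the stream.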
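This paper addresses the multi-robot Active Information Acquisition (AIA) problem, in which a team of mobile robots, communicating through an underlying graph, estimates a hidden state that expresses a phenomenon of interest. Applications such as target tracking, coverage, and SLAM can be expressed in this framework. Existing approaches, however, are either not scalable, unable to handle dynamic phenomena, or not robust to changes in the communication graph. To address these shortcomings, we propose an Information-aware Graph Block Network (I-GBNET), an AIA adaptation of graph neural networks that aggregates information over the graph representation and provides sequential decision-making in a distributed manner. The I-GBNET, trained via imitation learning with a centralized sampling-based expert solver, exhibits permutation equivariance and time invariance, while offering superior scalability, robustness, and generalizability to previously unseen environments and robot configurations. Experiments with hidden states and environments more complex than those seen in training validate the properties of the proposed architecture and its efficacy in localizing and tracking dynamic targets.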
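This paper addresses a new semantic multi-robot planning problem in uncertain and dynamic environments. In particular, the environment is occupied by non-cooperative, mobile, uncertain labeled targets. These targets are governed by stochastic dynamics, while their current and future positions as well as their semantic labels are uncertain. Our goal is to control mobile sensing robots so that they can accomplish collaborative semantic tasks defined over the current and future positions and labels of these targets. We express these tasks using Linear Temporal Logic (LTL). We propose a sampling-based approach that explores the robot motion space, the task specification space, and the future configurations of the labeled targets to design optimal paths. These paths are revised online to adapt to uncertain perceptual feedback. To the best of our knowledge, this is the first work to address semantic task planning problems in uncertain and dynamic semantic environments. We provide extensive experiments that demonstrate the efficiency of the proposed method.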
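This tutorial survey provides an overview of recent non-asymptotic advances in statistical learning theory as relevant to control and system identification. While there has been substantial progress across all areas of control, the theory is most well-developed for linear system identification and for learning the linear quadratic regulator, which are the focus of this manuscript. From a theoretical perspective, much of the labor underlying these advances has gone into adapting tools from modern high-dimensional statistics and learning theory. While highly relevant to control theorists interested in integrating tools from machine learning, the foundational material has not always been easily accessible. To remedy this, we provide a self-contained presentation of the relevant material, outlining all the key ideas and the technical machinery that underpin recent results. We also present a number of open problems and future directions.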
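Consensus algorithms form the foundation of many distributed algorithms by enabling multiple robots to converge to consistent estimates of global variables using only local communication. However, standard consensus protocols can easily be led astray by non-cooperative team members. The study of resilient forms of consensus is therefore necessary for designing resilient distributed algorithms. W-MSR consensus is one such resilient consensus algorithm; it requires only local knowledge of the communication graph and no a priori model of the data being shared. However, verifying that a given communication graph meets the strict graph connectivity requirement makes W-MSR difficult to use in practice. In this paper, we show that a communication graph structure commonly used in the robotics literature, the communication graph built from a Voronoi tessellation, automatically yields a graph that is sufficiently connected to reject a single non-cooperative team member. Furthermore, we show how this graph can be augmented to reject two non-cooperative team members, and we provide a roadmap for modifications that offer further resilience. This contribution will allow resilient consensus to be applied easily in algorithms that already rely on Voronoi-based communication, such as distributed coverage and exploration algorithms.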
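For readers unfamiliar with W-MSR, a minimal sketch of one update round follows. Each cooperative robot discards up to F neighbor values above its own and up to F below its own, then averages what remains together with its own value. Uniform averaging weights, the complete communication graph (a stand-in for the Voronoi-based graph), and the constant-valued adversary are assumptions chosen for illustration.

```python
def wmsr_step(own, neighbor_values, F):
    """One W-MSR update for a single robot with resilience parameter F."""
    higher = sorted(v for v in neighbor_values if v > own)
    lower = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]
    kept_higher = higher[:max(0, len(higher) - F)]  # drop the F largest
    kept_lower = lower[F:]                          # drop the F smallest
    kept = [own] + equal + kept_lower + kept_higher
    return sum(kept) / len(kept)  # uniform weights (an assumption)


def run_wmsr(values, neighbors, adversaries, F, rounds):
    """Synchronous rounds; adversaries broadcast a fixed value every round."""
    values = dict(values)
    for _ in range(rounds):
        updated = {}
        for robot, own in values.items():
            if robot in adversaries:
                updated[robot] = adversaries[robot]  # ignores the protocol
            else:
                heard = [values[j] for j in neighbors[robot]]
                updated[robot] = wmsr_step(own, heard, F)
        values = updated
    return values


# Five robots on a complete graph; robot 4 is non-cooperative.
initial = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 100.0}
neighbors = {i: [j for j in initial if j != i] for i in initial}
final = run_wmsr(initial, neighbors, adversaries={4: 100.0}, F=1, rounds=30)
cooperative = [final[i] for i in range(4)]
```

In this example the cooperative robots agree on a value inside the range of their own initial values, while the adversary's broadcast of 100 is filtered out at every round. Whether filtering succeeds in general depends on the connectivity of the communication graph, which is the property the abstract above is concerned with.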
Motivated by the fragility of neural network (NN) controllers in safety-critical applications, we present a data-driven framework for verifying the risk of stochastic dynamical systems with NN controllers. Given a stochastic control system, an NN controller, and a specification equipped with a notion of trace robustness (e.g., constraint functions or signal temporal logic), we collect trajectories from the system that may or may not satisfy the specification. In particular, each of the trajectories produces a robustness value that indicates how well (severely) the specification is satisfied (violated). We then compute risk metrics over these robustness values to estimate the risk that the NN controller will not satisfy the specification. We are further interested in quantifying the difference in risk between two systems, and we show how the risk estimated from a nominal system can provide an upper bound on the risk of a perturbed version of the system. In particular, the tightness of this bound depends on the closeness of the systems in terms of the closeness of their system trajectories. For Lipschitz continuous and incrementally input-to-state stable systems, we show how to exactly quantify system closeness with varying degrees of conservatism, while we estimate system closeness for more general systems from data in our experiments. We demonstrate our risk verification approach on two case studies, an underwater vehicle and an F1/10 autonomous car.
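As an illustration of computing risk metrics over robustness values, the sketch below treats the negated robustness of each trajectory as a loss and evaluates two common risk metrics on the empirical distribution: value-at-risk (VaR) and conditional value-at-risk (CVaR). The choice of these two metrics, the function names, and the sample robustness values are assumptions made for illustration; the abstract does not specify which metrics or estimators are used.

```python
import math


def value_at_risk(losses, beta):
    """Smallest loss with at least a beta fraction of samples at or below it."""
    ordered = sorted(losses)
    index = max(0, math.ceil(beta * len(ordered)) - 1)
    return ordered[index]


def conditional_value_at_risk(losses, beta):
    """Empirical CVaR: VaR plus the scaled mean excess above VaR."""
    var = value_at_risk(losses, beta)
    excess = sum(max(loss - var, 0.0) for loss in losses)
    return var + excess / ((1.0 - beta) * len(losses))


def violation_risk(robustness_values, beta):
    """Risk of violating the specification; positive values signal violation."""
    losses = [-r for r in robustness_values]  # low robustness = high loss
    return value_at_risk(losses, beta), conditional_value_at_risk(losses, beta)


# Hypothetical robustness values from eight sampled trajectories.
robustness = [1.0, 0.5, -0.5, 2.0, 0.25, -1.0, 1.5, 0.75]
var, cvar = violation_risk(robustness, beta=0.75)
```

Here a negative `var` with a positive `cvar` would mean that most trajectories satisfy the specification, but the worst quarter of them violate it on average.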